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ABSTRACT 

Robot  control  systems  require  concurrent  architectures  to  overcome  real-time  constraints.  Satisfying  real-time  constraints  is  important, 
but  determinism,  not  performance,  is  the  key  element  in  assessing  a control  model.  The  goal  of  this  paper  is  to  present  the  concurrent 
hierarchical  control  model  as  an  efficient,  flexible,  and  deterministic  robot  control  methodology  and  then  systematically  analyze  the 
factors  affecting  the  design  and  performance  of  a concurrent  hierarchical  control  system.  The  ability  for  concurrent  hierarchical  control 
to  be  systematically  modeled  is  derived  from  the  generic  software  approach  to  all  levels  of  control.  The  generic  software  control  primitive 
of  the  model  is  the  virtual  control  loop.  A virtual  control  loop  is  a software  concept  analogous  to  a hardware  duty  cycle;  software 
cyclically  samples  commands  and  status  to  produce  outputs  within  a bounded  response  time. 

Virtual  control  with  bounded  response  times  forms  the  generic  software  component  of  each  level  in  the  concurrent  robot  control 
hierarchy.  Mathematical  formalism  can  be  applied  to  concurrent  hierarchical  control  yielding  a timing  model  that  includes  metrics  of 
system  response  time  and  level  response  time.  The  results  of  the  timing  analysis  can  be  applied  to  process  management  in  a robot  control 
system  to  yield  a breakdown  of  control  into  concurrent  processes:  a short-term  executor  ensuring  cyclic  response  and  a long-term  planner 
anticipating  the  future.  A software  control  algorithm  will  be  presented  that  combines  concurrent  executors  and  planners  to  form  the 
generic  level  in  the  robot  control  hierarchy. 


Key  Words  and  Phases  : communication,  hierarchical  controL,real-time,  response  time,  task  decomposition,  virtual  control 


1.  Introduction 

Multiprocessor  systems,  especially  those  built  of  relatively  low-cost  microprocessors,  offer  a cost-effective 
solution  to  the  performance  needs  of  an  intelligent  robot  controller.  Multiprocessor  systems  are  beneficial  for  many 
reasons,  including  cost  performance,  modular  growth,  reliability  through  replication,  and  flexibility  for  testing 
alternative  control  strategies  via  different  partitioning  [8,18].  The  effectiveness  of  a parallel  implementation  depends 
on  the  inherent  parallelism  of  the  algorithms  and  the  cost  of  interprocessor  communication.  Algorithms  that  are  purely 
sequential  will  not  run  faster  on  a parallel  machine.  Further,  the  cost  of  concurrent  communication  is  a larger  factor 
than  it  would  be  on  a sequential  machine  using  procedure  calls.  Interprocessor  communication  using  shared  memory 
or  message  passing  takes  longer  because  of  the  overhead  of  protocols  and  extra  synchronization.  The  cost  of 
communications  must  not  negate  the  time  savings  obtained  by  parallel  execution  on  the  different  processors.  Thus,  to 
achieve  the  benefits  of  parallelism,  a robot  control  model  must  be  inherently  concurrent  and  have  an  efficient  means 
of  communication. 


The  tuth ais  current  address  is  the  National  Institute  of  Standards  and  Technology,  Robotics  Systems  Division,  Bldg.  220  Rm.  B127,  Gaithersburg,  MD. 
20899  This  article  was  prepared  by  United  States  Government  employees  as  a part  of  their  official  duties  and  is  therefore  a work  of  the  U.S.  Government  and  not 
subject  to  copyright-  This  article  references  certain  commercial  equipment,  instruments  or  materials  and  such  identification  does  not  imply  recommendation  or 
endorsement  by  the  National  Institute  of  Standards  and  Technology  nor  does  it  imply  that  the  materials  or  equipment  identified  are  necessary  the  best  available  for 
the  propose. 
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Parallel  machines  cover  a broad  range  of  computer  architectures  but  several  characteristics  delineate  purpose. 
Parallel  architectures  vary  within  a fine-grain  to  coarse-grain  domain  [13,  15,  16].  General-purpose  coarse-grain 
machines  are  a parallel  computer  architecture  maximizing  batch  throughput  Here,  the  user  may  be  unaware  of  any 
parallelism  involved  Supercomputers  are  coarse-grained  machines  that  specialize  in  fine-grained  number  crunching 
in  parallel  for  mathematically  intensive  algorithms.  In  this  case,  the  programmer  may  add  some  data  flow 
enhancements  to  assist  the  number  crunching.  Other  fine-grain  computers  specialize  in  data- intensive  operations  that 
do  not  handle  general-computing  efficiently.  The  variety  in  machines  leads  to  the  distinction  between  systems  that 
maximize  the  throughput  of  many  jobs,  known  as  throughput-oriented  multiprocessors,  and  systems  that  maximize 
the  execution  of  one  process,  known  as  speedup-oriented  multiprocessors  [5].  A robot  controller  will  be  characterized 
as  a speedup-oriented  multiprocessing  application  since  the  controller  is  partitioned  into  a set  of  concurrent, 
cooperating  processes. 

This  paper  describes  the  applicability  of  the  concurrent  hierarchical  model  as  a multiprocessor  architecture  for 
a robot  control  system.  The  subsequent  sections  of  the  paper  are  organized  as  follows:  the  second  section  discusses 
the  concurrent  hierarchical  control  model  as  an  architecture  for  a multi-computer  robot  control  system.  The  theory  of 
concurrent  hierarchical  control  model  will  be  presented  and  include  a discussion  on  the  importance  of  task 
decomposition  and  real-time  response  time.  A software  algorithm  detailing  the  concurrent  hierarchical  model  that 
readily  ports  to  a multi-computer  system  will  be  included.  The  third  section  presents  a timing  analysis  of  a control 
system  based  on  the  concurrent  hierarchical  model.  The  fourth  section  discusses  an  implementation  of  a concurrent, 
hierarchical  robot  control  system  at  the  National  Institute  of  Standards  and  Technology  (NIST,  formerly  the  National 
Bureau  of  Standards)  as  applied  to  the  NASA  flight  telerobotic  servicer  and  contrasts  this  model  to  other  robot  control 
models. 

2.  Concurrent  Hierarchical  Real-Time  Control 

A goal  of  an  efficient  multiprocessor  system  is  to  exploit  the  benefits  of  parallelism  while  reducing  software 
complexity.  Hierarchical  structure  is  a well-defined  design  technique  often  applied  to  concurrent  robot  control  [1,4, 
5,  18,  21].  Hierarchical  structuring  of  a control  system  offers  an  easy  and  systematic  parallel  approach.  In  itself, 
hierarchical  structure  is  not  sufficient  to  defme  concurrency.  Task  decomposition  and  response  time  further  refine  the 
concurrent  hierarchical  model  to  handle  the  processor  management  of  a hierarchical  system,  the  complexity  of  any 
level,  and  communication  protocols. 

Task  decomposition  is  defined  as  the  process  of  recursively  partitioning  a task  into  smaller,  more  manageable 
subtasks.  Task  decomposition  is  complete  when  the  subtasks  are  atomic.  In  this  paper,  a subtask  partition  defines  a 
level  in  the  hierarchy.  At  each  level  in  the  hierarchical  breakdown,  an  interface  exists  through  which  the  adjoining 
levels  exchange  information.  As  applied  to  control  processes,  one  should  not  confuse  hierarchical  decomposition  with 
a deeply-nested  serial  process  that  follows  a single  thread  of  control  flow.  Although  levels  may  share  some  data  that 
models  the  world,  all  levels  run  concurrently,  and  each  can  be  defined  as  a virtual  control  loop. 

Virtual  control , as  applied  to  hierarchical  task  decomposition,  is  a software  control  strategy  that  consists  of  the 
following  periodic  actions:  1)  inputting  status  and  commands  from  adjoining  levels,  2)  producing  a goal-directed 
output  based  on  the  status  and  command,  and  3)  outputting  an  action  to  the  adjoining  lower  level  and  a status  to  the 
adjoining  upper  level.  One  iteration  through  the  virtual  control  loop  defines  a cycle.  When  executing,  each  virtual 
control  layer  in  the  hierarchy  can  be  considered  part  of  a long  chain  defining  the  hierarchical  state,  yet  each  layer 
independently  formulates  its  control  flow.  This  leads  to  the  definition  of  concurrent  hierarchical  control  as 
communicating  layers  of  virtual  control  loops.  Figure  1 illustrates  the  information  flow  in  the  concurrent  hierarchical 
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model. 


Figure  1.  Information  Flow  in  a Concurrent  Hierarchical  Control  System 

A concurrent  task  decomposition  model  does  not  complete  a definition  of  the  control  system.  Mere  functional 
correctness  is  not  sufficient.  Of  paramount  importance  is  measuring  the  system  performance  by  the  time  delay  required 
to  respond  (i.e.,  calculate  a solution.)  A real-time  system  must  satisfy  physical  time  constraints.  Adding  timing 
constraints  changes  the  mechanism  of  task  decomposition.  Now,  the  number  of  operations  performed  depends  heavily 
on  the  time  required  to  make  a decision  and  has  a direct  effect  on  decomposition.  Responding  to  an  event  too  late 
nullifies  the  control  no  matter  how  intelligent  the  subsequent  action.  This  timing  restriction  leads  to  the  standard 
definition  of  response  time  as  the  maximum  allowable  duration  between  an  event  and  an  action  resulting  from  the 
occurrence  of  that  event. 

Concurrency  between  levels  does  not  guarantee  a system  response  time.  Concurrency  within  a level  is  necessary 
to  ensure  predictable  system  behavior.  To  achieve  predictable  real-time  control,  each  level  in  the  hierarchy  must 
periodically  sample  inputs  and  produce  outputs.  Each  level  cannot  run  independently  of  adjoining  levels.  Should  one 
level  be  unable  to  complete  the  virtual  control  loop  within  one  cycle,  system  performance  is  no  longer  predictable.  For 
example,  a level  awaiting  a reading  from  a failed  sensor  could  loop  forever.  Concurrency  within  a level  can  be 
achieved  by  dividing  a control  level  to  account  for  present  and  future  actions.  The  planner  is  responsible  for  generating 
a plan  consisting  of  a series  of  actions.  The  planner  selects  the  best  plan  from  a candidate  list  of  alternative  plans  that 
achieve  the  commanded  goal,  given  the  current  state  of  the  environment.  An  executor  enables  state  transitions  and  so 
is  responsible  for  stepping  through  a generated  plan.  The  executor  must  run  periodically  to  guarantee  the  cyclic 
behavior  of  the  virtual  control  loop.  The  planner  and  executor  can  run  concurrently  within  a level,  thus  insuring  a 
predictable  level  response  which  implies  a predictable  system  response. 

The  following  algorithm  sketches  an  overview  of  planning  and  executing  within  a level  of  the  hierarchical 
concurrency  model.  The  algorithm  is  written  in  pseudo  code  using  Brinch  Hansen  primitives  cobegin  and  coend  for 
concurrent  execution  f 1 0],  It  is  important  to  note  that  the  concurrent  sections  of  code  could  be  either  tightly-  or  loosely- 
coupled  depending  on  timing  constraints.  (Processes  that  execute  in  parallel  on  separate  processors  are  defined  as 
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loosely-coupled.  Processes  that  execute  in  a multi-tasked  operating  system  on  a single  processor  are  defined  as  tightly- 
coupled.)  In  this  algorithm,  the  familiar  file  operations  lock  and  unlock  are  primitives  that  guarantee  mutual  exclusion. 
Comments  in  the  following  algorithm  will  be  delimited  by  double  quotes. 


procedure  level() 


cobegin 


repeat  "executor  section,  runs  at  a higher  priority' 


waitunxil(  next-cycle); 

update(  next -cycle ); 

lock;  read_command;  unlock; 

lock;  readjstatus;  unlock ; 

read  plan; 

process; 

lock ; write _command;  unlock; 
lock;  write _st at  us;  unlock; 
until  forever; 


"timing  synchronization  primitive 
"update  next  cycle  count" 

"read  command  " 

" read  status  " 

"read  current  plan" 

"execute  level  " 

"write  command  " 

"write  status  " 


repeat  "planner" 
plan; 

lock;  write _plan;  unlock; 
until  forever ; 


"do  level  planning" 
"update  executor  plan ' 


coend 

The  concurrency  of  the  planner  and  executor  allows  this  software  algorithm  to  be  easily  ported  onto  a parallel 
pipelined  computer  architecture.  Each  level  in  the  hierarchy  runs  as  a concurrent  process.  Parallelism  is  not  a direct 
result  of  concurrent  hierarchical  control  because  a lower  level  cannot  process  without  a command  from  its  adjoining 
higher  level.  However,  once  a task  is  underway,  this  creates  a pipeline  in  which  each  level  is  executing  in  parallel.  For 
example,  it  would  take  5 control  cycles  for  information  from  the  top  level  to  Filter  down  to  the  bottom  level  in  a five 
layer  hierarchy.  However,  this  cyclic  protocol  will  guarantee  a deterministic  system  response  time. 


3,  Timing  Analysis 

Systems  that  meet  response  time  obligations  are  defined  as  real-time  systems.  The  concept  of  response  time  or 
hard  real-time  must  be  contrasted  to  soft  time  used  as  a sequencer  of  events.  For  example,  a sequence  of  robot 
commands  GOTO  A,  GOTO  B , does  not  contain  any  timing  information.  Within  the  decomposition  of  this  command, 
only  the  lower  levels  are  concerned  about  timing  so  that  constant  updates  to  the  robot  are  guaranteed.  However,  the 
command  GOTO  A BY  f.  GOTO  B BY  t.+j  where  /•  and  t-+j  are  some  explicit  time  values,  demands  real  time 

handling.  Now,  a solution  must  be  supplied  within  an  explicit  time.  This  embodies  the  distinction  between  time  as  a 
dynamic  real-world  parameter  versus  time  as  a sequencing  tool. 

For  a hierarchical  control  system  to  be  real  time,  it  must  meet  the  demands  of  1)  the  response  time  of  the  system, 
plus  2)  the  response  time  of  each  level  in  the  hierarchy.  Response  time  will  be  assumed  to  be  an  input  of  the 
specification  and  therefore,  a constraint  of,  rather  than  the  output  of,  some  timing  analysis. 

3.1  System  Response  Time 

System  response  time  can  be  considered  in  two  ways.  One  perspective  is  to  measure  the  elapsed  time  a system 
takes  to  respond  to  a specific  discrete  event  Another  view  is  the  time  necessary  to  provide  a complete  solution.  In  a 
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computer  program,  consider  the  distinction  between  the  response  to  a keyboard  terminating  interrupt  as  opposed  to 
the  time  elapsed  until  the  program  has  completed.  Programs  that  fail  to  respond  to  a program  interruption  are  flawed. 
In  this  paper,  system  response  will  be  defined  as  the  amount  of  time  to  achieve  an  immediate  response. 


System  response  time  can  be  calculated  as  the  sum  of  the  worst  case  time  responses  as  commands  and  status 
filter  up  or  down  the  concurrent  hierarchy.  The  movement  of  command  and  status  can  be  characterized  as  follows. 
Let  R-  be  the  response  time  for  level  i.  If  n denotes  the  number  of  levels  employed  in  the  hierarchical  system  and  the 

maximum  response  time  of  a control  cycle  of  any  of  the  levels  is  R , then  if  the  ith  level  spends  t1  seconds 

max  c 

communicating/waiting  for  commands  and  status,  and  t1prQC  processing,  this  leads  to  the  following  minimum  timing 
constraint  for  each  cycle: 

< < ‘proc  + l‘c ) = Ri  > £ Rmax : for each  i=1  - "•  <3'» 

An  upper  bound  to  the  system  response  time  R§  can  be  calculated  as  follows: 


R < nR 
s max 


(3.2) 


Thus  R/n  defines  R__v  the  maximum  response  time  of  any  level  R 

o max  max 


No  level  can  exceed  this  time  limit 


and  still  guarantee  proper  system  communication  flow.  This  implies  a synchronization  signal  in  each  level  to 
periodically  communicate  (sample  command  and  status)  so  that  updated  information  can  be  trasnferred  between 
levels.  This  periodic  communication  prevents  any  level  from  processing  in  isolation  and  skewing  the  system  response 
time. 


The  rate  at  which  each  level  samples  new  commands  and  statuses  is  dependent  on  its  position  within  the 
hierarchy.  Assume  the  machine  update  rate  of  the  lowest  level  tied  to  the  robot  is  1 millisecond  per  cycle.  Imposing 
this  rate  as  Rmax  for  each  level  may  be  too  constraining.  A 1 millisecond  cycle  precludes  the  use  of  context 

switching.  All  higher  levels  may  want  to  sample  using  some  other  less-restrictive,  yet  bounded,  time  period.  This  leads 
to  a definition  of  R , system  response  time,  as  the  sum  of  R-  response  times  tied  to  the  machine  update  rate,  and  the 

sum  of  Rj  response  times  arbitrarily  set  to  meet  the  system  response  time  goal. 

R = XR:  +IR-  for  i=l ...  m andj=m+l,n  (3.3) 

s 1 j 

where  V(Rk)<Rmax  for  k=l,...  ,n  . 

As  a design  metric,  the  response  time  of  a system  dictates  the  amount  of  system  processing  power.  A system 
that  requires  a response  in  10  minutes  has  much  different  requirements  than  a system  limited  by  1 second  response 
times.  However,  system  response  time  does  not  directly  correlate  to  performance  at  lower  levels.  An  overall  10  minute 
system  response  time  may  still  require  millisecond  updates  at  the  lowest  level  to  achieve  real-time  control.  In  effect, 
system  response  time  gives  a general  metric  to  evaluate  timing. 

3.2  Low-Level  Timing  Requirements 

The  required  system  response  time  restricts  the  maximum  response  time  imposed  on  each  level.  Meeting  this 
timing  restriction  may  not  allow  some  levels  enough  processing  time  for  sophisticated  control.  This  leads  to  the 
motivation  to  divide  task  decomposition  into  planners  and  executors.  Now,  each  level  can  maintain  real-time  control. 
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yet  concurrently  evaluate  alternative  future  actions.  An  initial  assumption  is  that  t1^^,  the  time  the  executor  runs  each 


cycle,  is  less  than  the  level  response  time  limit  maximum;  otherwise  the  system  response  time  must  be  increased  or  a 
finer  resolution  of  task  decomposition  must  be  attempted. 


t < R 
exec  max 


(3.4) 


Response  time  for  any  level  i,  previously  defined  as  R-,  depends  on  the  timing  constraints  of  the  level  in  the 
hierarchy.  R-  is  composed  of  texec  , the  time  each  level  spends  running  the  executor,  tcomm,  the  time  spent 
communicating/waiting  for  data,  and  tresjcjuap  time  remaining  before  the  next  cycle.  This  leads  to  the  following 
definition  for  R.  response  time  for  the  ith  level. 


R = t1  + t1  + S'* 

i exec  comm  residual  v * ’ 

Equation  3.5  leads  to  some  observations.  If  the  response  time  of  the  level  is  very  small,  say  1 millisecond,  this 

leaves  very  little  time  to  do  both  processing  and  communicating.  Finally,  the  amount  of  residual  time  per 

level  dictates  whether  the  executor,  the  planner  and  other  processes  can  run  concurrently  as  tightly-coupled  processes 
on  one  processor  or  must  run  loosely-coupled  on  different  processors.  Each  of  these  observations  will  be  explored 
further. 


At  the  lower  levels  of  the  system,  performance  must  be  fast  and  predictable.  An  example  is  the  servo  loop  at  the 
motor  level.  These  operations  must  operate  with  short  response  times  and  guaranteed  update  rates.  For  example,  in 
order  for  a robot  to  exhibit  smooth  motion,  control  updates  to  the  arm  must  be  supplied  within  a set  time  linked  to  the 
physical  hardware  capabilities.  A control  cycle  executing  sufficiently  fast  will  provide  motion  control  that  will  appear 
continuous  and  thus  real-time.  Smooth  motion  results  from  an  efficient  input-compute-output  cycle  that  maintains  a 

small  standard  deviation  each  cycle  with  an  upper  bound  a on  variance.  Let  us  define  the  residual  time  tlresj(jua]  as 

the  time  remaining  in  a cycle  after  the  executor  is  finished.  This  implies  that  not  only  must  be  greater  that 

zero,  but  must  be  bounded  by  a sum  of  the  worst  case  times  for  both  processing  and  communication.  This  leads  to  the 
following  constraint. 


0 < t 


residual 


a < R.  - t 


i,max 


exec 


i,max 


comm 


where  t1,max  exec  = max(t1,Jexec)  j=  1,...  «,  cycles 


,i,max 


f ’*  = max(tlx*  m ) j=  1,...  00  cycles 

comm  comm7  J 


This  relation  sets  a limit  on  the  t1rp<;^lia1  time  available  to  other  background  processes  each 

implies  that  adding  further  capabilities  with  a small  residual  time  will  require  additional 

will  adversely  affect  t*  „ , the  time  for  communications. 

J comm 


(3.6) 


cycle.  This  further 
processors,  which 


In  practice,  response  times  at  the  lowest  levels  are  very  fast  with  minimal  time  devoted  to  communication  and 
planning.  There  are  numerous  examples  that  illustrate  this  principle  with  efficient,  multiprocessor  designs  at  low 
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levels  of  the  hierarchy  [8, 14, 19,  20, 22]. 


3 J Higher-Level  Timing  Requirements 

Moving  up  in  the  hierarchy,  timing  constraints  are  relaxed  and  leads  to  an  increase  in  R-,  level  response  time. 

The  increase  in  R-  means  that  the  communications  and  executor  components  of  a virtual  control  cycle  are  a smaller 

percentage  of  the  levels  processing  activity.  Increased  time  at  higher  levels  allows  slower  options  to  execute  in  the 
background,  such  as  secondary  data  storage,  multi-tasking,  and  character  input  and  output  for  a user  interface.  In  fact, 
loosely-coupled  machines  could  conceivably  link  together  in  a distributed  control  system  at  the  very  highest  levels  for 
systems  with  a moderating  response  time. 

In  the  middle  of  the  hierarchy  the  concern  for  processor  power  remains.  A planner  and  executor  tightly-coupled 
on  the  same  processor  is  desirable  for  its  simplicity  and  the  ease  on  communication  requirements.  The  decision  to 
partition  a control  level  (i.e.  executor  and  planning  processes)  as  tightly-coupled  on  one  processor  or  loosely-coupled 
on  parallel  processors  depends  on  the  constraints  of  timing  and  available  computing  resources. 

In  order  to  analyze  the  relationship  between  the  planner  and  executor  in  an  tightly-coupled  environment,  several 
assumptions  need  to  be  established.  First,  levels  execute  every  cycle  and  these  cycles  are  of  fixed  time  c.  This  leads 
to  the  following  equality  of  time  cycles  throughout  the  system. 

t i i = t-  + c i=  0,...  °°  cycles  (3.7) 

Second,  once  the  executor  has  completed  one  cycle,  it  must  wait  for  the  next  clock  cycle  before  processing 
again.  At  higher  levels,  sufficient  residual  time  allows  for  the  executor  to  block  and  wait  until  the  next  time  cycle 
before  processing.  For  lower  levels,  the  executor  could  busy/wait  and  poll  on  a system  clock  awaiting  the  next  clock 
cycle  in  cases  where  the  time  for  a context  switch  between  tasks  may  be  too  large  a percentage  of  cycle  operation. 


Third,  the  concept  of  blocked,  running,  waiting  processes  implies  the  use  of  multitasking.  In  order  to  achieve 
real-time  performance,  the  multi-tasking  scheduling  need  not  be  fair,  as  in  a general-purpose  operating  system. 
Instead,  the  multitasking  must  allow  assignment  of  priorities  to  tasks.  The  requirement  for  priorities  leads  to  the 
assumption  that  the  executor  is  of  higher  priority  than  the  planner.  The  higher  priority  of  the  executor  preserves  the 
system  response  requirement. 


Fourth,  communication  between  the  planner  and  the  executor  depends  on  non-interruptible  critical  sections. 
This  is  important  since  some  multi-tasking  systems  lack  a non-preemptive  feature. 

Given  these  assumptions  for  tightly-coupled  execution,  further  timing  constraints  include  context  switching 
overhead,  critical  section  support,  and  general  planner  throughput  requirements.  Whether  tight-coupled  execution  is 

advisable  remains  to  be  determined  First,  the  planning  phase  of  each  cycle  provided  only  a small  t1resj(jua|,  residual 

time  processing  and  would  preclude  a planner  from  formulating  a plan  in  a reasonable  amount  of  time.  The 
requirement  that  the  planner  must  finish  any  plan  within  d cycles  should  be  imposed  on  a level.  This  assumption  leads 

to  the  constraint  that  a plan  take  d1  cycles  to  finish  for  the  ith  level, 


/ d1  < t1 


2 t. 


cs 


(3.8) 


plan  ' t residual 

where  each  cycle  using  tresj(juaj  minus  two  times  the  tcs  a context  switching  amount  of  processing  time. 
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If  this  condition  cannot  be  held,  then  a multiprocessor  planner  and  executor  level  should  be  used.  Assuming 

the  condition  to  finish  a plan  within  d1  cycles  does  hold,  more  detailed  constraints  need  resolving.  First,  response  time 
needs  reevaluation.  Therefore,  including  the  planner  on  the  same  processor  adds  two  extra  context  switches  to  a cycle, 
represented  in  time  t , plus  increases  the  amount  of  communication  time  to  include  both  interlevel  communication 

t'il-comm  communication  between  the  planner  and  executor  t*e  p.comm-  This  has  the  corresponding  effect  on 
response  time: 


o _ tl  i x + 2 t 

i exec  comm  plan  + residual  cs 


where  t* „ = t1-,  + t1  _ _ • 

comm  ll.comm  e.p-comm 


(3.9) 


Since  the  planner  and  executor  modules  share  data,  whenever  the  planner  must  update  the  plan  graph,  it  enters 
a critical  section  whereby  it  cannot  be  preempted.  This  update  may  occur  every  cycle  but  it  must  be  accounted  for  in 
the  worst  case  scenario.  A problem  may  arise  for  plan  updating  of  a large  amount  of  data  because  the  amount  of  time 
allotted  for  the  critical  section  must  only  be  a fractional  amount  of  the  time  used  by  the  executor  each  cycle.  This  leads 
to  the  constraint  that  the  planner  to  executor  communication  is  small. 


e.p-comm 


« 


R. 

l 


(3.10) 


If  the  planners  cannot  meet  this  constraint,  a parallel  planner  and  executor  should  be  used  and  interprocessor 
communication  between  the  processes  should  use  a scheme  that  incorporates  time  slices  or  double-buffers  for  plan 
update.  Satisfying  this  plan  updating  constraint  heeds  one  cautionary  note.  Should  a critical  section  in  one  cycle 
overlap  into  the  next  cycle,  the  executor  would  wait  until  the  planner  completes  updating  a portion  of  the  plan  graph 
before  continuing  processing.  Thus,  mutual  exclusion  requires  a non-preemptable  multitasking  scheduler  to  prevent 
preemption  of  the  lower  priority  planner  in  its  critical  section.  Summarizing,  the  following  features  must  be  available 
for  an  tightly-coupled  planner/executor. 


• high  priority  processes  (i.e.,  executors)  run  every  cycle  at  a fixed  interval  responding  to  current 
control  requirements,  and 

• lower  priority  processes  (i.e.,  planners)  run  in  the  background  anticipating  future  control 
requirements. 

• low  priority  processes  allow  critical  sections  to  run  to  completion  (i.e.,  non -preemptive)  but  must 
be  only  a fractional  time  portion  of  higher-priority  processes  operating  cycle 


3.4  Algorithm  Timing  Analysis 

The  timing  constraints  of  the  control  level  algorithm  outlined  in  the  previous  section  can  be  studied  using  a 
Gantt  chart  notation.  For  the  sake  of  accounting,  the  algorithm  will  be  divided  into  functions.  These  functions  will  be 
assigned  arbitrary  timing  constraints  for  the  purposes  of  analysis.  R will  represent  the  function  for  an  interprocessor 
read  of  a command  and  status  and  will  take  1 ms.  W will  represent  the  function  for  an  interprocessor  write  of  a 
command  and  status,  and  will  take  1 ms.  X will  represent  the  current  level  executor  processing  and  will  range  from  1 
to  3 ms  computation  intervals.  P will  represent  the  planner  function  and  will  require  1 to  6 ms  before  updating  a plan. 
Plans  must  be  updated  within  16  ms  or  at  worst  an  updated  plan  should  be  available  every  third  cycle.  U will  represent 
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the  critical  section  where  the  planner  updates  the  current  plan  graph.  For  simplicity,  operating  system  computation 
costs  such  as  context  switches  will  be  ignored.  Initially,  a response  time  of  8 ms  will  be  considered.  For  this  case,  all 
the  functions  can  run  tightly-coupled  on  one  processor  in  one  cycle,  with  sufficient  residual  processing  time  as  shown 
by  the  Gantt  chart  in  figure  2. 


W 


W 


U 


8 ms 


16  ms 


cycle  1 


cycle  2 


Figure  2.  Control  Level  Computation  of  Single  Processor  Concurrent  Execution 


Notice  that  the  algorithm  repeats  ad  infinitum  the  basic  "RXWPU"  computational  pattern.  The  planner  runs  at 
least  3 ms.  each  cycle  which  implies  that  after  two  cycles  6 ms.  will  have  been  allotted  to  the  planner  with  1 ms.  for 
plan  updating.  More  stringent  timing  constraints  may  degrade  system  performance  beyond  an  acceptable  level.  For 
example,  a 6 ms.  response  time  allows  insufficient  time  for  planning,  so  that  within  3 cycles  no  updated  plan  exists. 
Figure  3 shows  the  Gantt  chart  for  this  situation. 
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cycle  2 


cycle  3 


Figure  3.  Control  Level  Failing  to  Meet  Planning  Time  Constraints 

The  lack  of  computational  resources  for  planning  suggests  the  partitioning  of  the  problem  across  processors. 
Now,  processor  one  (PI)  will  have  exclusive  execution,  but  will  have  the  additional  timing  constraint  of  interprocessor 
communication  with  the  planner.  The  function  C will  represent  the  interprocessor  communication  overhead.  C 
requires  1 ms.  With  excess  computational  power,  the  processor  must  now  idle  between  cycles  represented  by  a dash 
(i.e.,  -).  The  executor  repeats  ad  infinitum  a "CRXW-"  computational  pattern.  Processor  two  (P2)  will  execute 
planning  exclusively.  Computational  resources  easily  guarantee  6 ms.  planning  to  generate  updated  plans  with  12  ms. 
As  a side  effect  of  excess  computational  power,  the  planner  may  have  to  wait  for  fresh  information  from  R,  an  executor 
status  read.  (Assume  for  the  discussion  that  shared  R status  information  is  in  a critical  section  and  read-only  for  the 
planner.)  The  Gantt  chart  in  figure  4 shows  the  execution  scheme  in  parallel  processors. 
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Figure  4.  Control  Level  Computational  of  Dual-Processor  Execution 

In  the  parallel  design,  the  control  level  effectively  meets  all  timing  constraints  with  some  computational  waste. 
However,  this  solution  can  be  sensitive  to  increases  in  complexity  of  the  executor  (X ) portion  of  the  control  cycle.  For 
example,  if  the  function  X is  replaced  by  a newer  version  which  requires  4 ms.  a cycle,  then  the  above  functional 
allocation  across  processors  will  not  work.  For  in  the  worst  case,  there  is  no  time  for  interprocessor  communication 
with  the  planner  (C).  In  this  case,  faster  hardware  or  a more  refined  task  decomposition  strategy  must  be  used. 


4.  Application  of  Concurrent  Hierarchical  Control  and  Comparisons  to  Other  Models 

The  concurrent  hierarchical  model  of  a robot  control  system  has  been  previously  implemented  with  a purely 
executor  style  of  task  decomposition  for  a robot  control  system  at  the  National  Institute  of  Standards  and  Technology 
[3].  The  flow  of  control  featured  state-table  transitions.  Where  planning  was  appropriate,  static  plan  definitions  were 
used.  The  system  offered  several  benefits.  First,  the  system  was  sensory-interactive  and  adapted  to  perturbations  in  the 
environment  in  real-time  even  though  the  world  model  was  limited  to  basic  feature  recognition.  Second,  interface 
standards  were  proposed  as  a consequence  of  the  well-defined  interfaces  of  hierarchical  decomposition  [6].  Third,  the 
communicating  sequential  levels  aided  information-hiding  and  insulated  software  on  one  level  from  changes  on 
another  level.  Fourth,  high-frequency  feedback  from  the  periodic  sampling  of  commands  and  status  ensured 
predictability  necessary  for  a reliable  and  safe  robot  controller. 

The  NASA/NBS  Standard  Reference  Model  for  Telerobot  Control  System  Architecture  (NASREM)  is  the 
basis  for  the  current  robot  control  system  under  development  [2].  The  concurrent  hierarchy  decomposes  into  these 
levels:  task,  elemental  move,  primitive,  and  servo.  NASREM  extends  the  concurrent  hierarchical  model  to  examine 
such  research  issues  as  task  planning,  extensive  world  modeling  including  maps,  object  definitions  and  feature 
recognition.  To  maintain  cost,  flexibility  and  portability,  multiprocessors  communicating  across  a shared  backplane 
have  been  chosen  for  the  hardware.  Concurrency  is  supported  as  both  loosely-coupled  on  multiprocessors  and  tightly- 
coupled  on  a single  processor.  Software  development  is  based  on  a real-time,  cross-target  ADA  environment  extended 
to  meet  real-time  scheduling  priorities. 

The  real-time  system  consists  of  a series  of  Motorola  68020  processor  boards  linked  with  the  VME  bus.  Each 
processor  board  of  the  target  system  contains  a small,  fast,  multitasking,  real-time  executive  that  guarantees 
predictable  execution  times.  For  the  real-time  executive,  task-switching  is  approximately  100  microseconds.  A system 
clock  synchronizes  software  execution  on  all  processor  boards.  Layered  on  top  of  this  real-time  kernel  is  a shared 
memory  reader/writer  scheme  used  for  inter-level  concurrent  communication.  The  host-target  operator  interface 
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communication  is  linked  using  a VME  bus  to  VME  bus  adapter.  The  hosts  provide  non-real-time  services,  such  a 
programming  development,  user-interface,  or  graphic  diagnostic  display. 


The  differing  response  times  between  levels  affects  granularity.  At  the  lower  primitive  level  [2],  the  planner  is 
responsible  for  generating  a series  of  trajectory  points  in  the  background.  In  the  foreground,  the  executor  runs  at  50  Hz 
and  supplies  new  joint  positions  to  the  servo  level  at  each  clock  cycle.  At  the  servo  level,  the  robot  receives  torque 
commands  at  updates  of  400  Hz.  Wheatley  offers  a more  detailed  examination  of  the  timing  constraints  at  the  lower 
levels  [24].  At  higher  levels,  completion  time  for  plans  is  less  critical.  Multitasking  on  a single  processor  supplies 
enough  processing  p>ower  for  both  the  planner  and  executor.  Figure  5 provides  a summary  of  the  implementation 
design  for  planning  and  execution  within  the  concurrent  hierarchical  model  based  on  timing  constraints  at  NIST. 
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Figure  5.  Concurrent  Resource  Requirements  Based  on  Cycle  Response  Time 

4.1  Comparison  of  Concurrent  Hierarchical  Control  to  Other  Models 

A survey  of  existing  robot  control  systems  provides  numerous  concurrent  models.  The  requirement  to  satisfy 
real-time  constraints  has  led  to  numerous  implementations  using  robot  control  architectures  with  several  processor 
boards  on  a common  bus  [8, 12, 13, 19, 20, 22].  These  architectures  share  common  characteristics  with  the  concurrent, 
hierarchical  control  model.  Although  few  of  these  control  systems  are  directly  labeled  as  a hierarchical,  most  systems 
have  at  least  a two-level  hierarchy,  with  a low-level,  real-time  subsystem  handling  real-time  motion  control,  and  a 
slower  high-level  subsystem  responsible  for  planning.  For  example,  the  following  research  models  use  the  two-level 
hierarchy.  The  CMU  CHIMERA  model  p>artitions  concurrency  as  a two  layer  model,  low-level  real-time  motion 
control  connected  to  a non-real-time  AML/X  command  environment  [2].  The  MIT  CONDOR  project  has  a real-time 
system  consisting  of  a series  of  Motorola  68020  based  boards  connected  by  VME  bus  to  VME  bus  adapter  to  a SUN- 
3 workstation  controlling  the  Utah/MIT  hand  [19].  The  University  of  Pennsylvania  has  a robot  coordinator  issue  force 
and  motion  commands  to  a real-time  robot  force  and  motion  server  [20].  Brown  University  uses  a series  of  high-sp>eed 
networks  connecting  a real-time  servo  systems  and  general  computing  resources  [12].  IBM  partitions  concurrency  into 
real-time  and  non-real-time  by  connecting  the  real-time  system  to  the  programming  system  by  a real-time  bridge  [13]. 
IBM  extends  the  exchange  of  commands  and  status  to  include  varying  frequencies;  such  as  every  cycle,  every  nth 
invocation  or  asynchronously. 

The  RCA-McGill  RCCL  model  uses  a 3 level  concurrency  hierarchy  that  supports  a planning  level,  a trajectory 
level,  and  a joint-control  level.  The  joint  control  level  operates  at  1 kHz  and  assigns  1 processor  p>er  joint.  The  Fine 
motion  trajectory  level  runs  a 50  Hz  and  runs  as  high-priority  cyclic  task.  Concurrently,  a rough  motion  planner 
asynchronously  produces  a queue  of  motion  requests  for  the  trajectory  level  as  a background  task.  In  RCCL,  the 
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trajectory  level  acts  like  an  executor. 


The  HIC  model  is  an  operating  system  based  on  a hierarchy  of  servo  loops  for  controlling  the  Utah/MIT  hand 
[4].  This  model  emphasis  layers  of  servo  loops  with  deterministic  scheduling  as  the  main  control  process  and  uses 
periodic  data  buffers  for  communication.  The  periodic  data  sampling  of  the  model  supports  the  deterministic  flavor  of 
concurrent  hierarchical  processing  model. 

Implicitly,  these  real-time  models  may  have  additional  levels  within  the  hierarchy,  but  further  layering  does  not 
directly  correspond  to  a virtual  control  loop  implementation.  This  is  due  in  part  to  the  concept  of  one  return  status  per 
procedure  call  of  serial  execution.  For  example,  a Cartesian  trajectory  plan  in  a sequential  robot  language  such  as 
AML/X  must  be  decomposed  into  a series  of  kinematic  poses  for  the  robot.  This  process  is  implicitly  hierarchical. 
These  implicit  levels  could  be  partitioned  into  separate  concurrent  processes  that  communicate  through  established 
interfaces  rather  than  serial  routines  that  communicate  via  subroutine  calls.  Further,  with  concurrent  operation,  the 
concept  of  sampling  the  return  status,  not  just  receiving  a final  status,  embodies  the  feedback  nature  of  a virtual  control 
loop. 

5.  Conclusion 

Concurrent  hierarchical  structuring  is  a model  that  not  only  satisfies  robot  control  timing  constraints,  but 
provides  predictable  system  performance.  Hierarchical  structuring  uses  task  decomposition  to  create  the  level  of 
granularity  necessary  to  meet  the  strict  real-world  timing  requirement  at  the  lowest  level  while  offering  a convenient 
software  development  tool.  A hierarchical  control  system  defined  with  task  decomposition  offers  information  hiding 
between  levels  that  can  lead  to  well-defined,  and  eventually  standardized  interfaces.  The  basis  of  concurrency  within 
the  hierarchical  model  is  the  virtual  control  loop.  The  regulated  and  predictable  behavior  of  the  virtual  control  loop 
leads  to  timing  metrics  guaranteeing  performance.  Predictable  response  time,  especially  important  as  complexity 
increases,  is  a key  for  more  reliable  and  safe  robot  controls  systems. 
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